Explore the power of model ensembling with voting classifiers. Learn how to combine multiple machine learning models to improve accuracy and robustness across diverse applications, and gain practical insights and a global perspective.
Mastering Model Ensembling: A Comprehensive Guide to Voting Classifiers
In the ever-evolving field of machine learning, achieving high accuracy and robust performance is paramount. One of the most effective ways to improve model performance is model ensembling. This approach combines the predictions of multiple individual models to build a stronger, more reliable model. This guide explores model ensembling with a focus on voting classifiers, covering how they work, their advantages, and their practical implementation. It aims to be accessible to readers worldwide and offers insights and examples relevant to a range of regions and application areas.
Understanding Model Ensembling
Model ensembling is the technique of combining the strengths of multiple machine learning models. Rather than relying on a single model, which may be prone to particular biases or errors, an ensemble draws on the collective judgment of several models. This strategy often leads to substantial gains in accuracy, robustness, and generalization. Averaging over the individual models' errors also reduces variance, which lowers the risk of overfitting. Ensembles are especially effective when the individual models are diverse, that is, when they use different algorithms, different subsets of the training data, or different feature sets. This diversity lets the ensemble capture a broader range of patterns and relationships in the data.
The main families of ensemble methods are:
- Bagging (Bootstrap Aggregating): This method trains multiple models on different subsets of the training data, created by sampling with replacement (bootstrapping). Random Forest is a well-known bagging-based algorithm.
- Boosting: Boosting algorithms train models sequentially, with each subsequent model trying to correct the errors of its predecessors. Examples include AdaBoost, Gradient Boosting, and XGBoost.
- Stacking (Stacked Generalization): Stacking trains multiple base models and then uses another model (a meta-learner or blender) to combine their predictions.
- Voting: The focus of this guide. Voting combines the predictions of multiple models by majority vote (for classification) or by averaging (for regression).
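To make the four families concrete, here is a minimal sketch that fits one scikit-learn estimator from each family on the Iris dataset. The particular estimators, hyperparameters, and the `families` and `family_scores` names are illustrative assumptions rather than recommendations, and `GradientBoostingClassifier` simply stands in for the boosting family.

```python
from sklearn.datasets import load_iris
from sklearn.ensemble import (
    BaggingClassifier,
    GradientBoostingClassifier,
    StackingClassifier,
    VotingClassifier,
)
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42, stratify=y
)

# Base models shared by the stacking and voting examples (both clone them internally).
base = [
    ("lr", LogisticRegression(max_iter=1000)),
    ("tree", DecisionTreeClassifier(random_state=0)),
]

families = {
    # Bagging: many trees, each trained on a bootstrap sample of the training data.
    "bagging": BaggingClassifier(
        DecisionTreeClassifier(random_state=0), n_estimators=25, random_state=0
    ),
    # Boosting: trees added sequentially, each one correcting the current errors.
    "boosting": GradientBoostingClassifier(random_state=0),
    # Stacking: a meta-learner combines the base models' predictions.
    "stacking": StackingClassifier(
        estimators=base, final_estimator=LogisticRegression(max_iter=1000)
    ),
    # Voting: base models' predicted probabilities are averaged.
    "voting": VotingClassifier(estimators=base, voting="soft"),
}

family_scores = {}
for name, model in families.items():
    family_scores[name] = model.fit(X_train, y_train).score(X_test, y_test)
    print(f"{name:>8}: test accuracy = {family_scores[name]:.3f}")
```

On a dataset as small and easy as Iris the four scores are usually close, so the sketch illustrates the APIs rather than any ranking of the methods.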
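A Closer Look at Voting Classifiers
A voting classifier is a specific type of ensemble method that combines the predictions of multiple classifiers. For classification tasks, the final prediction is usually determined by majority vote. For example, if three classifiers predict classes A, B, and A respectively, the voting classifier predicts class A. Their simplicity and effectiveness have made voting classifiers a popular choice in many machine learning applications. They are relatively easy to implement and often improve noticeably on the performance of the individual classifiers used alone.
There are two main types of voting classifier:
- Hard voting: In hard voting, each classifier casts a vote for a specific class label. The final prediction is the label that receives the most votes. This is a direct approach that is easy to understand and implement.
- Soft voting: Soft voting takes into account the probability that each classifier predicts for each class. Instead of casting a direct vote, the per-class probabilities from all classifiers are summed (or, equivalently, averaged), and the class with the highest total is chosen as the final prediction. Soft voting often outperforms hard voting because it uses each classifier's confidence. The base classifiers must be able to provide probability estimates (for example, through the `predict_proba` method in scikit-learn), and the approach works best when those estimates are reasonably well calibrated.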
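The difference between the two rules is easiest to see on a single sample where they disagree. The following sketch uses made-up probabilities for three hypothetical classifiers and two classes; the numbers are assumptions chosen only to make the contrast visible.

```python
import numpy as np

classes = np.array(["A", "B"])

# Rows: three hypothetical classifiers. Columns: predicted probability of A and B.
proba = np.array([
    [0.45, 0.55],  # classifier 1 leans slightly towards B
    [0.40, 0.60],  # classifier 2 leans slightly towards B
    [0.95, 0.05],  # classifier 3 is very confident in A
])

# Hard voting: each classifier votes for its most probable class, majority wins.
votes = proba.argmax(axis=1)
hard_pred = classes[np.bincount(votes, minlength=len(classes)).argmax()]

# Soft voting: average the probabilities per class, then take the largest.
mean_proba = proba.mean(axis=0)
soft_pred = classes[mean_proba.argmax()]

print("hard voting ->", hard_pred)              # B (two votes against one)
print("soft voting ->", soft_pred, mean_proba)  # A, mean probabilities [0.6 0.4]
```

Two lukewarm votes for B win under hard voting, while the single confident prediction for A dominates the average under soft voting. Whether that behavior is desirable depends on how trustworthy the confident classifier's probabilities are.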
Benefits of Using Voting Classifiers
Voting classifiers offer several key advantages that explain their widespread use:
- Improved accuracy: By combining the predictions of multiple models, a voting classifier can often achieve higher accuracy than any of its individual classifiers. This is especially true when the individual models have different strengths and weaknesses.
- Increased robustness: Ensembles help reduce the impact of outliers and noisy data. When one model makes a mistake, the others can compensate, which leads to more stable and reliable predictions.
- Reduced overfitting: Ensemble techniques, including voting, average the predictions of multiple models. This smooths out the idiosyncratic errors of individual models (reducing variance), which can reduce overfitting.
- Versatility: Voting classifiers can be used with many types of base classifier, including decision trees, support vector machines, and logistic regression, which gives flexibility in model design.
- Ease of implementation: Frameworks such as scikit-learn provide a straightforward implementation of voting classifiers, making them easy to integrate into a machine learning pipeline.
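The accuracy benefit can be quantified in an idealized setting. The sketch below assumes a binary problem and an odd number of classifiers that are each correct with the same probability `p` and make errors independently of one another. Real classifiers are correlated, so the numbers are an upper-bound intuition rather than a prediction; the function name `majority_vote_accuracy` is hypothetical.

```python
from math import comb


def majority_vote_accuracy(p: float, n: int) -> float:
    """Probability that a strict majority of n independent classifiers,
    each correct with probability p, gives the right answer (binary task, odd n)."""
    k_min = n // 2 + 1  # smallest number of correct votes that forms a majority
    return sum(comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(k_min, n + 1))


for n in (1, 3, 5, 11):
    print(f"n={n:>2}  p=0.7 -> {majority_vote_accuracy(0.7, n):.3f}")

# With p = 0.7 and n = 3 the majority is right with probability
# 0.7**3 + 3 * 0.7**2 * 0.3 = 0.784, which is higher than any single member.
```

The same formula shows why base-model quality matters: with `p` below 0.5 the majority vote is worse than a single classifier, for example 0.352 for three classifiers with `p = 0.4`.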
Practical Implementation with Python and Scikit-learn
Let's illustrate how to use a voting classifier with a practical example in Python and the scikit-learn library. We use the well-known Iris dataset for classification. The following code demonstrates both a hard-voting and a soft-voting classifier:
from sklearn.ensemble import RandomForestClassifier, VotingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.svm import SVC
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score
# Load the Iris dataset
iris = load_iris()
X = iris.data
y = iris.target
# Split the data into training and testing sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
# Define individual classifiers
clf1 = LogisticRegression(max_iter=1000, random_state=1)  # higher max_iter helps the solver converge on unscaled features
clf2 = RandomForestClassifier(random_state=1)
clf3 = SVC(probability=True, random_state=1)
# Hard Voting Classifier
eclf1 = VotingClassifier(estimators=[('lr', clf1), ('rf', clf2), ('svc', clf3)], voting='hard')
eclf1 = eclf1.fit(X_train, y_train)
y_pred_hard = eclf1.predict(X_test)
print(f'Hard Voting Accuracy: {accuracy_score(y_test, y_pred_hard):.3f}')
# Soft Voting Classifier
eclf2 = VotingClassifier(estimators=[('lr', clf1), ('rf', clf2), ('svc', clf3)], voting='soft')
eclf2 = eclf2.fit(X_train, y_train)
y_pred_soft = eclf2.predict(X_test)
print(f'Soft Voting Accuracy: {accuracy_score(y_test, y_pred_soft):.3f}')
In this example:
- We import the required components: `RandomForestClassifier`, `LogisticRegression`, `SVC`, `VotingClassifier`, `load_iris`, `train_test_split`, and `accuracy_score`.
- We load the Iris dataset and split it into training and test sets.
- We define three individual classifiers: a logistic regression model, a random forest classifier, and an SVC (support vector classifier). Note the `probability=True` parameter on the SVC. It enables the classifier to output probability estimates, which soft voting requires.
- We create a hard-voting classifier by passing `voting='hard'` to `VotingClassifier`. It trains the individual models and predicts by majority vote.
- We create a soft-voting classifier by passing `voting='soft'` to `VotingClassifier`. It also trains the individual models, but combines their predicted probabilities to make predictions.
- We evaluate the accuracy of both voting classifiers on the test set. A voting classifier, especially a soft-voting one, often matches or outperforms its individual members, although on a small and easy dataset such as Iris the differences may be negligible.
Practical insight: If your base classifiers can provide probability estimates, always consider soft voting. It often yields better results.
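A single train/test split on 150 samples gives a noisy picture, so one way to check whether the ensemble really helps is to compare every member and both voting modes under cross-validation. This is a sketch under the same assumptions as the example above (same three base models, 5-fold cross-validation chosen arbitrarily); the `candidates` and `cv_means` names are illustrative.

```python
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier, VotingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.svm import SVC

X, y = load_iris(return_X_y=True)

estimators = [
    ("lr", LogisticRegression(max_iter=1000, random_state=1)),
    ("rf", RandomForestClassifier(n_estimators=100, random_state=1)),
    ("svc", SVC(probability=True, random_state=1)),
]

# Evaluate each base model on its own, then the two voting ensembles.
candidates = dict(estimators)
candidates["hard"] = VotingClassifier(estimators=estimators, voting="hard")
candidates["soft"] = VotingClassifier(estimators=estimators, voting="soft")

cv_means = {}
for name, model in candidates.items():
    scores = cross_val_score(model, X, y, cv=5, scoring="accuracy")
    cv_means[name] = scores.mean()
    print(f"{name:>4}: {scores.mean():.3f} +/- {scores.std():.3f}")
```

If the ensemble's mean score is within the spread of its best member, the extra training and prediction cost may not be justified for that dataset.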
Choosing the Right Base Classifiers
The performance of a voting classifier depends heavily on the choice of base classifiers. Selecting a diverse set of models is essential. Here are some guidelines for choosing base classifiers:
- Diversity: Choose classifiers that differ in their algorithms, their use of features, or their training approach. Diversity lets the ensemble capture a wider range of patterns and reduces the risk that all members make the same mistake. For example, combining a decision tree, a support vector machine, and a logistic regression model is a good starting point.
- Performance: Each base classifier should perform reasonably well on its own. Ensembling is unlikely to rescue base models that do little better than chance.
- Complementarity: Consider how well the classifiers complement one another. If one classifier is strong in a particular region of the problem, choose others that excel in different regions or handle different kinds of data.
- Computational cost: Balance performance gains against computational cost. Complex models may improve accuracy, but they increase training and prediction time. Keep the practical constraints of your project in mind, especially with large datasets or real-time applications.
- Experimentation: Try different combinations of classifiers to find the best ensemble for your specific problem. Evaluate performance on a validation set using appropriate metrics (for example, accuracy, precision, recall, F1 score, or AUC). This iterative process is essential to success.
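Diversity can be checked empirically before building the ensemble. One simple proxy, used here as an assumption rather than a standard prescribed by the guide, is the pairwise disagreement rate: the fraction of validation samples on which two models predict different labels. The choice of models below is illustrative.

```python
from itertools import combinations

import numpy as np
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)
X_train, X_val, y_train, y_val = train_test_split(
    X, y, test_size=0.3, random_state=0, stratify=y
)

models = {
    "lr": LogisticRegression(max_iter=1000),
    "tree": DecisionTreeClassifier(random_state=0),
    "knn": KNeighborsClassifier(),
}

# Fit each candidate and keep its predictions on the validation split.
preds = {name: m.fit(X_train, y_train).predict(X_val) for name, m in models.items()}

for name, p in preds.items():
    print(f"{name:>4} accuracy: {np.mean(p == y_val):.3f}")

# Fraction of validation samples where two models disagree (0 = identical behavior).
disagreement = {}
for a, b in combinations(preds, 2):
    disagreement[(a, b)] = float(np.mean(preds[a] != preds[b]))
    print(f"{a} vs {b}: disagreement = {disagreement[(a, b)]:.3f}")
```

Models that are individually accurate but disagree on some samples are the ones most likely to benefit from voting; a disagreement rate of zero means the ensemble cannot differ from its members.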
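Hyperparameter Tuning for Voting Classifiers
Fine-tuning the hyperparameters of the voting classifier and of its individual base classifiers is essential for getting the best performance. Hyperparameter tuning means optimizing the model settings to achieve the best results on validation data. Here is a strategic approach:
- Tune the individual classifiers first: Start by tuning the hyperparameters of each base classifier independently. Use techniques such as grid search or randomized search together with cross-validation to find good settings for each model.
- Consider weights (for weighted voting): scikit-learn's `VotingClassifier` accepts a `weights` parameter that assigns a relative weight to each base classifier (applied to the vote counts in hard voting and to the predicted probabilities in soft voting), but it does not learn or optimize those weights for you. Choose them yourself, for example by comparing candidate weightings with cross-validation, or build a custom voting scheme. Adjusting the weights gives better-performing classifiers more influence and can improve the ensemble. Caution: overly elaborate weighting schemes can lead to overfitting.
- Tune the ensemble (where applicable): In some scenarios, especially with stacking or more complex ensemble methods, you may want to tune the meta-learner or the voting process itself. This is less common with simple voting.
- Cross-validation is key: Always use cross-validation during hyperparameter tuning to obtain a reliable estimate of model performance and to avoid overfitting to the training data.
- Hold-out set: Always set aside data for the final evaluation of the tuned model, and keep it separate from the data used for tuning (this is usually called the test set).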
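As a sketch of how these steps might fit together in scikit-learn: base-model hyperparameters inside a `VotingClassifier` can be addressed with the `<estimator name>__<parameter>` convention in `GridSearchCV`, and candidate `weights` can then be compared with cross-validation. The grid values, the two-model ensemble, and the candidate weightings below are arbitrary assumptions.

```python
from sklearn.base import clone
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier, VotingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV, cross_val_score, train_test_split

X, y = load_iris(return_X_y=True)
# Keep a test split aside; it is only touched once, at the very end.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42, stratify=y
)

ensemble = VotingClassifier(
    estimators=[
        ("lr", LogisticRegression(max_iter=1000)),
        ("rf", RandomForestClassifier(random_state=1)),
    ],
    voting="soft",
)

# Step 1: tune base-model hyperparameters through the ensemble.
param_grid = {"lr__C": [0.1, 1.0, 10.0], "rf__n_estimators": [25, 50]}
search = GridSearchCV(ensemble, param_grid, cv=3, scoring="accuracy")
search.fit(X_train, y_train)
print("best base-model settings:", search.best_params_)

# Step 2: compare a few candidate weightings with cross-validation.
weight_scores = {}
for w in [(1, 1), (2, 1), (1, 2)]:
    candidate = clone(search.best_estimator_).set_params(weights=list(w))
    weight_scores[w] = cross_val_score(candidate, X_train, y_train, cv=3).mean()
    print(f"weights={w}: CV accuracy = {weight_scores[w]:.3f}")

# Step 3: refit the chosen configuration and evaluate once on the test split.
best_w = max(weight_scores, key=weight_scores.get)
final_model = clone(search.best_estimator_).set_params(weights=list(best_w))
test_accuracy = final_model.fit(X_train, y_train).score(X_test, y_test)
print(f"test accuracy with weights={best_w}: {test_accuracy:.3f}")
```

Tuning the base models and the weights in two stages keeps the search small; a joint search is possible but grows quickly with the number of models.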
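Practical Applications of Voting Classifiers: Global Examples
Voting classifiers are applied across a wide range of industries and applications around the world. Here are some examples of how these techniques are used:
- Healthcare: In many countries, from the United States to India, voting classifiers are used in medical diagnosis and prognosis. For example, combining predictions from several image-analysis models or patient-record models can support the detection of diseases such as cancer.
- Finance: Financial institutions worldwide use voting classifiers for fraud detection. Combining predictions from different models (for example, anomaly detection, rule-based systems, and behavioral analysis) can identify fraudulent transactions more accurately.
- E-commerce: E-commerce businesses around the world use voting classifiers in product recommendation systems and sentiment analysis. Combining the outputs of multiple models helps them offer more relevant product suggestions and gauge customer feedback on products more accurately.
- Environmental monitoring: In regions such as the European Union and parts of Africa, ensemble models are used to monitor environmental change, including deforestation, water quality, and pollution levels. Aggregating the outputs of several models gives a more reliable assessment of environmental conditions.
- Natural language processing (NLP): In regions from the United Kingdom to Japan, voting classifiers are used for tasks such as text classification, sentiment analysis, and machine translation. Combining predictions from multiple NLP models yields more accurate and robust results.
- Autonomous driving: Many countries are investing heavily in autonomous driving technology (for example, Germany, China, and the United States). Voting classifiers are used to combine predictions from multiple sensors and models (for example, object detection and lane detection) to improve the vehicle's perception and inform driving decisions.
These examples show the versatility of voting classifiers in tackling real-world problems and their applicability across domains and locations.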
Best Practices and Considerations
Implementing voting classifiers effectively requires careful attention to several best practices:
- Data preparation: Make sure your data is properly preprocessed. This includes handling missing values, scaling numerical features, and encoding categorical variables. Data quality has a large effect on model performance.
- Feature engineering: Create relevant features that improve model accuracy. Feature engineering often requires domain knowledge and can have a substantial impact on performance.
- Evaluation metrics: Choose evaluation metrics that fit the nature of the problem. Accuracy may be suitable for balanced datasets, but for imbalanced datasets consider precision, recall, F1 score, or AUC.
- Preventing overfitting: Use cross-validation, regularization, and early stopping to prevent overfitting, especially when working with complex models or limited data.
- Interpretability: Consider how interpretable the model needs to be. Ensemble methods can deliver high accuracy, but they can be harder to interpret than individual models. If interpretability matters, look into techniques such as feature-importance analysis or LIME (Local Interpretable Model-agnostic Explanations).
- Computational resources: Keep computational cost in mind, especially with large datasets or complex models. Consider optimizing your code and choosing appropriate hardware.
- Regular monitoring and retraining: Machine learning models should be monitored regularly for performance degradation. Retrain them on new data to maintain performance, and consider setting up an automated retraining system.
Advanced Techniques and Extensions
Beyond the basic voting classifier, several advanced techniques and extensions are worth exploring:
- Weighted voting: scikit-learn's `VotingClassifier` supports weighted voting through its `weights` parameter. Assign different weights to the classifiers based on their performance on a validation set, so that more accurate models have a greater influence on the final prediction.
- Stacking with voting: Stacking uses a meta-learner to combine the predictions of base models. A voting classifier can serve as one layer of such an architecture, for example as the component that combines the outputs of several stacked models, which may improve performance further.
- Dynamic ensemble selection: Instead of training a fixed ensemble, you can select a subset of models dynamically based on the characteristics of the input data. This is useful when the best model varies from input to input.
- Ensemble pruning: After building a large ensemble, you can prune it by removing models that contribute little to overall performance. This reduces computational complexity without a large effect on accuracy.
- Uncertainty quantification: Explore ways to quantify the uncertainty of the ensemble's predictions. This helps you understand how much confidence to place in a prediction and supports better-informed decisions, especially in high-stakes applications.
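As one simple illustration of uncertainty quantification, the averaged class probabilities of a soft-voting ensemble can be summarized by their entropy: values near 0 mean the ensemble is confident, values near the maximum mean the probability mass is spread across classes. This is a rough heuristic under the assumption that the members' probabilities are reasonably calibrated, not a calibrated uncertainty estimate, and the model choices are illustrative.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier, VotingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0, stratify=y
)

ensemble = VotingClassifier(
    estimators=[
        ("lr", LogisticRegression(max_iter=1000)),
        ("rf", RandomForestClassifier(n_estimators=100, random_state=0)),
        ("nb", GaussianNB()),
    ],
    voting="soft",
).fit(X_train, y_train)

# Averaged class probabilities, shape (n_samples, n_classes).
proba = ensemble.predict_proba(X_test)

# Predictive entropy, normalized to [0, 1] by the entropy of a uniform distribution.
entropy = -np.sum(proba * np.log(np.clip(proba, 1e-12, 1.0)), axis=1)
normalized_entropy = entropy / np.log(proba.shape[1])

# Show the test samples the ensemble is least sure about.
most_uncertain = np.argsort(normalized_entropy)[::-1][:5]
for i in most_uncertain:
    print(f"sample {i}: proba={np.round(proba[i], 2)}, "
          f"normalized entropy={normalized_entropy[i]:.2f}")
```

In a high-stakes setting, samples above a chosen entropy threshold could be routed to a human reviewer instead of being decided automatically.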
Conclusion
Voting classifiers offer a powerful and versatile way to improve the accuracy and robustness of machine learning models. By combining the strengths of multiple individual models, a voting classifier can often outperform a single model, leading to better predictions and more reliable results. This guide has given an overview of the basic principles of voting classifiers, their practical implementation with Python and scikit-learn, and their real-world applications across industries and global contexts.
As you start working with voting classifiers, remember to prioritize data quality, feature engineering, and proper evaluation. Experiment with different base classifiers, tune their hyperparameters, and consider the advanced techniques above to optimize performance further. Used with care, ensembling can help you get the most out of your machine learning models and achieve strong results in your projects. Keep learning and exploring to stay current in the ever-evolving field of machine learning.